feat: auto-detect proxy and translate peer addresses to contact point #153
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##              main     #153      +/-   ##
==========================================
+ Coverage    55.64%   55.74%   +0.09%
==========================================
  Files           21       22       +1
  Lines         4868     4879      +11
==========================================
+ Hits          2709     2720      +11
  Misses        2159     2159
==========================================
```
CI Summary — Status: ✅ All jobs passed (Rustfmt ✅, Clippy ✅, Tests ✅, Build ✅ — no issues).
Force-pushed 44508e7 to b2b55d7
@dkropachev @Lorak-mmk is this the right way to use the rust driver? Equivalent to scylladb/python-driver#833.
Force-pushed b2b55d7 to 2e38ceb
```rust
/// An [`AddressTranslator`] that redirects all peer connections to the original
/// contact point address. Used when the cluster is accessed through a proxy.
///
/// All discovered node addresses are translated to `proxy_address`, ensuring
/// the driver only connects through the proxy endpoint.
#[derive(Debug, Clone)]
pub struct ProxyAddressTranslator {
    /// The proxy/contact point address to route all connections through.
    proxy_address: SocketAddr,
}

impl ProxyAddressTranslator {
    /// Create a new translator that routes all connections to `proxy_address`.
    pub fn new(proxy_address: SocketAddr) -> Self {
        Self { proxy_address }
    }

    /// Returns the proxy address this translator routes to.
    pub fn proxy_address(&self) -> SocketAddr {
        self.proxy_address
    }
}

#[async_trait]
impl AddressTranslator for ProxyAddressTranslator {
    async fn translate_address(
        &self,
        _untranslated_peer: &UntranslatedPeer,
    ) -> Result<SocketAddr, TranslationError> {
        Ok(self.proxy_address)
    }
}
```
|
|
I'm not sure if this achieves what you want. I assume the proxy address leads to one specific node.
Let's say you have 5 nodes in the cluster, each with 32 shards. With your changes, the driver will still see all 5 nodes in system.peers/local, try to open connection pools to all 5 nodes. The address translation will cause all connections to be opened to the same node (driver won't even know about it), so you'll get 160 connections to this node.
You could use PoolSize::PerNode(1) (which is a good idea in cqlsh regardless of all other changes) to get this down to 5.
Then the connection amount problem is not that bad, but you are still in a very weird state where driver thinks it opened pools to all nodes, but they are really all to one node. Will this work correctly? It may, I'm not completely sure.
TBH I don't know how to solve that with existing APIs. There isn't really a way to implement a HostFilter that would filter out other nodes, because HostFilter accepts Peer, which has an address fetched from system.peers or system.local - so you can't really say for sure if it's the same peer as the contact point.
What would be nice here is a simplified session, with a separate builder, where driver only opens a single connection to the given address, and uses it both as CC and to execute user requests. This would also work for the maintenance socket. cc @wprzytula - let's discuss this when we meet, there are some not obvious decisions when implementing such session.
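The connection arithmetic in the comment above can be sketched as follows (the numbers come from the reviewer's example of 5 nodes with 32 shards each; the two helper functions are illustrative, not driver APIs):

```rust
// Connections opened when the pool size is configured per shard
// (the shard-aware default): every shard of every node gets a pool.
fn per_shard_total(nodes: u32, shards_per_node: u32, pool_size: u32) -> u32 {
    nodes * shards_per_node * pool_size
}

// Connections opened when the pool size is configured per host,
// PoolSize::PerHost-style: one pool per node regardless of shards.
fn per_host_total(nodes: u32, pool_size: u32) -> u32 {
    nodes * pool_size
}

fn main() {
    // 5 nodes x 32 shards, pool size 1: 160 connections, all landing
    // on the single node behind the proxy after address translation.
    assert_eq!(per_shard_total(5, 32, 1), 160);
    // With a per-host pool of 1, this drops to 5.
    assert_eq!(per_host_total(5, 1), 5);
}
```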
Agreed on the connection count concern. Added PoolSize::PerHost(1) — cqlsh is a single-user interactive tool so one connection per node is plenty. This brings it down to N connections (one per discovered node) all going through the proxy.
Re: the simplified single-connection session — that would be ideal. For now this is a pragmatic workaround. Happy to migrate when that API exists.
Makes sense. This is a weird state for the driver to be in, but I think it should work. @wprzytula will be available tomorrow if you want him to also take a look (maybe I am missing some potential problem).
I agree with your thoughts above @Lorak-mmk. Aside from that, maybe let's also use SingleTargetLoadBalancingPolicy to simplify the load balancing part?
This will be difficult for the same reason host filtering is difficult. You would need to know which node you have CC opened to, and it can no longer be done by socket address.
BTW, the code in this PR was proven to work as expected in the case where the broadcast address isn't available (AWS public/private addresses).
Anyhow, I'm going to merge this one, so it won't regress the expected behavior from the python cqlsh (before the driver change that broke it).
```rust
Ok(ScyllaDriver {
    session,
    prepared_cache: Mutex::new(HashMap::new()),
    consistency: Mutex::new(Consistency::One),
```
Have you considered using our CachingSession instead of implementing cache yourself?
first time I'm hearing about CachingSession, I don't think python has an equivalent, we'll check it out
Opened #164 to track this. The current CqlDriver trait separates prepare/execute-by-id, so it's not a trivial swap — but worth doing as a follow-up refactor.
> first time I'm hearing about CachingSession, I don't think python have the equivalent, we'll check it out
scylladb/scylla-rust-driver#333 is open for nearly 5 years now...
When connecting through a proxy/load balancer (e.g., AWS NLB, PrivateLink), the driver discovers internal node IPs from system.peers that are unreachable from the client. This installs a ProxyAddressTranslator that redirects all peer connections to the original contact point address. Since known_node addresses are never translated by the scylla driver (only peer addresses from system.peers are), this is safe for both direct and proxy connections.
- Resolve DNS hostnames via tokio::net::lookup_host instead of addr.parse::<SocketAddr>(), which silently fails for domain names
- Add PoolSize::PerHost(1) to limit connections per node (cqlsh is single-user; also mitigates connection explosion through the proxy)
- Remove unused detect_proxy() function and proxy_address() getter
- Trim module docs to match the actual always-install strategy

Refs #164
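The first bullet's DNS fix can be sketched with the blocking std equivalent of `tokio::net::lookup_host` (`resolve_contact_point` is an illustrative helper, not the PR's actual code):

```rust
use std::net::{SocketAddr, ToSocketAddrs};

// Try to parse a literal "ip:port" first; fall back to DNS resolution
// for hostnames, which addr.parse::<SocketAddr>() alone cannot handle.
fn resolve_contact_point(addr: &str) -> std::io::Result<SocketAddr> {
    if let Ok(sa) = addr.parse::<SocketAddr>() {
        return Ok(sa); // already a literal socket address
    }
    addr.to_socket_addrs()? // resolves e.g. "db.example.com:9042" via DNS
        .next()
        .ok_or_else(|| {
            std::io::Error::new(std::io::ErrorKind::NotFound, "no addresses resolved")
        })
}

fn main() -> std::io::Result<()> {
    // A literal address parses without touching DNS.
    assert_eq!(resolve_contact_point("127.0.0.1:9042")?.port(), 9042);
    // A hostname goes through the resolver.
    assert!(resolve_contact_point("localhost:9042")?.ip().is_loopback());
    Ok(())
}
```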
Force-pushed 76bf605 to b26f30b
Benchmark
| Benchmark suite | Current: b26f30b | Previous: e62a853 | Ratio |
|---|---|---|---|
| cli_parse_args/no_args | 29230 ns | 29467 ns | 0.99 |
| cli_validate/valid_full | 2 ns | 3 ns | 0.67 |
| cqlshrc_parse/empty | 3759 ns | 3622 ns | 1.04 |
| cqlshrc_parse/minimal | 10410 ns | 9615 ns | 1.08 |
| cqlshrc_parse/full | 68033 ns | 66240 ns | 1.03 |
| config_merge/full_merge | 875 ns | 902 ns | 0.97 |
| end_to_end_startup/full | 150280 ns | 153420 ns | 0.98 |
| parse_multiline/6_lines | 7394 ns | 7588 ns | 0.97 |
| classify_input/empty | 13 ns | 14 ns | 0.93 |
| format_table/rows/10 | 85706 ns | 86826 ns | 0.99 |
| format_table/rows/100 | 767400 ns | 740640 ns | 1.04 |
| format_table/rows/1000 | 7640700 ns | 7416000 ns | 1.03 |
| format_expanded/rows/10 | 10212 ns | 10184 ns | 1.00 |
| format_json_100 | 48732 ns | 50118 ns | 0.97 |
| format_csv_100 | 38780 ns | 39539 ns | 0.98 |
This comment was automatically generated by workflow using github-action-benchmark.
Summary

Adds a `ProxyAddressTranslator` that redirects all peer connections to the original contact point, enabling connections through proxies/load balancers (AWS NLB, PrivateLink, etc.)

Problem

When connecting through a proxy, the driver discovers internal node IPs from `system.peers` that are unreachable from the client, causing `Connection error: No connections in the pool`.

Solution

The scylla-rust-driver's `AddressTranslator` trait translates peer addresses discovered from `system.peers`. Since `known_node` addresses (the contact point) are never translated, we can safely always install a translator that redirects all peer addresses to the contact point.

Equivalent to scylladb/python-driver#833 (DynamicWhiteListRoundRobinPolicy) but using the rust driver's native `AddressTranslator` mechanism.

Testing

`cargo test --lib proxy_address_translator` — 5 unit tests
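The invariant the solution relies on can be modeled without the scylla crate. In this std-only sketch, the `Cluster` struct and `dial_targets` method are hypothetical simplifications: the contact point is used verbatim (mirroring how `known_node` addresses are never translated), while every peer discovered from `system.peers` is rewritten to the same endpoint.

```rust
use std::net::SocketAddr;

// Hypothetical model of the driver's view of the cluster behind a proxy.
struct Cluster {
    contact_point: SocketAddr, // used as-is, never translated
    peers: Vec<SocketAddr>,    // discovered from system.peers
}

impl Cluster {
    // Every address the driver will actually dial: the contact point,
    // plus each peer after proxy translation (which collapses them all
    // to the contact point).
    fn dial_targets(&self) -> Vec<SocketAddr> {
        let mut targets = vec![self.contact_point];
        targets.extend(self.peers.iter().map(|_| self.contact_point));
        targets
    }
}

fn main() {
    let cluster = Cluster {
        contact_point: "203.0.113.10:9042".parse().unwrap(),
        peers: vec![
            "10.0.0.1:9042".parse().unwrap(), // internal, unreachable IPs
            "10.0.0.2:9042".parse().unwrap(),
        ],
    };
    let targets = cluster.dial_targets();
    // One target per node, all going through the proxy endpoint.
    assert_eq!(targets.len(), 3);
    assert!(targets.iter().all(|a| *a == cluster.contact_point));
}
```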